[Figure labels: Second-level Instruction Cache; Thread Processing Unit (×3); First-level Instruction Cache (×3); Execution]
Authors
Abstract
This paper presents a new parallelization model, called coarse-grained thread pipelining, for exploiting speculative coarse-grained parallelism from general-purpose application programs in shared-memory multiprocessor systems. This parallelization model, which is based on the fine-grained thread pipelining model proposed for the superthreaded architecture [11, 12], allows concurrent execution of loop iterations in a pipelined fashion with run-time data-dependence checking and control speculation. The speculative execution combined with the run-time dependence checking allows the parallelization of a variety of program constructs that cannot be parallelized with existing run-time parallelization algorithms. The pipelined execution of loop iterations in this new technique results in lower parallelization overhead than in other existing techniques. We evaluated the performance of this new model using some real applications and a synthetic benchmark. These experiments show that programs with a sufficiently large grain size compared to the parallelization overhead obtain significant speedup using this model. The results from the synthetic benchmark provide a means for estimating the performance that can be obtained from application programs that will be parallelized with this model. The library routines developed for this thread pipelining model are also useful for evaluating the correctness of the codes generated by the superthreaded compiler and in debugging and verifying the simulator for the superthreaded processor.
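The pipelined execution of loop iterations described in the abstract can be illustrated with a minimal sketch. It is not the paper's library: the function name `pipelined_loop`, the round-robin assignment of iterations to threads, and the in-order write-back step are assumptions made for illustration, and the sketch omits the run-time data-dependence checking and control speculation entirely. It only shows iterations computing concurrently while their results are committed in original loop order.

```python
import threading


def pipelined_loop(n_iters, body, n_threads=4):
    """Hypothetical sketch: run body(i) for each loop iteration on several
    threads, but write results back strictly in iteration order."""
    results = [None] * n_iters
    committed = []                 # order in which iterations wrote back
    turn = threading.Condition()   # guards next_to_commit and results
    next_to_commit = [0]           # index of the iteration allowed to commit

    def worker(tid):
        # Assumption: iterations are dealt round-robin to threads.
        for i in range(tid, n_iters, n_threads):
            value = body(i)        # computation: may overlap across threads
            with turn:             # write-back: serialized in loop order
                while next_to_commit[0] != i:
                    turn.wait()
                results[i] = value
                committed.append(i)
                next_to_commit[0] += 1
                turn.notify_all()

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, committed


if __name__ == "__main__":
    results, committed = pipelined_loop(8, lambda i: i * i, n_threads=3)
    print(results)
    print(committed)
```

In this sketch the cost of the ordered write-back plays the role of parallelization overhead: if `body(i)` is short relative to the synchronization, the threads spend most of their time waiting for their turn, which matches the abstract's point that the grain size must be large compared with the overhead.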
Similar resources
The Effect of Executing Mispredicted Load Instructions in a Speculative Multithreaded Architecture
Concurrent multithreaded architectures exploit both instruction-level and thread-level parallelism in application programs. A single-threaded sequencing mechanism needs speculative execution beyond conditional branches in order to exploit more instruction-level parallelism. In addition, an aggressive multithreaded architecture should also use thread-level control speculation in order to exploit ...
Prefetch Threads for Database Operations on a Simultaneous Multi-threaded Processor
Simultaneous Multi-threading (SMT) has been developed to increase instruction level parallelism by allowing instructions from a different thread to run during a stall. Inter-thread cache interference, however, might limit the benefit of running multiple independent threads. SMT processors can be utilized in a different model, where a helper thread is used to prefetch cache blocks for the main e...
An Instruction Cache Architecture for Parallel Execution of Java Threads
Designing a Java processor that supports horizontal multithreading has become more attractive as network computing gains importance. Unlike traditional superscalar processors, which issue multiple instructions from a single instruction stream to exploit instruction-level parallelism (ILP), horizontal multithreading Java processors issue multiple instructions (bytecodes) fr...
Optimising long-latency-load-aware fetch policies for SMT processors
Simultaneous Multithreading (SMT) processors fetch instructions from several threads, exposing the available Instruction Level Parallelism (ILP) of each thread to the processor. Compared to a superscalar processor, the fetch engine in an SMT processor has an additional degree of freedom in selecting independent instructions. The fetch engine determines how shared resources are...
Clustering Cores for Parallel Thread Execution
In recent years, we have observed a strong trend towards using accelerators, such as GPUs, to speed up scientific applications. This results in a complex heterogeneous system in which traditional CPUs are used for the execution of sequential threads, while GPUs are used for accelerating parallel threads. Instead of following this trend, this paper introduces a new explicitly parallel instructio...
Journal title:
Volume, issue
Pages -
Publication date 2001